Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement

Authors

  • Björn W. Schuller
  • Martin Wöllmer
  • Tobias Moosmayr
  • Gerhard Rigoll
Abstract

The performance of speech recognition systems degrades strongly in the presence of background noise, such as the driving noise inside a car. In contrast to existing works, we aim to improve noise robustness by focusing on all major levels of speech recognition: feature extraction, feature enhancement, speech modelling, and training. To this end, we give an overview of promising auditory modelling concepts, speech enhancement techniques, training strategies, and model architectures, which we implement in an in-car digit and spelling recognition task that considers noise produced by various car types and driving conditions. We show that joint speech and noise modelling with a Switching Linear Dynamic Model (SLDM) outperforms speech enhancement techniques such as Histogram Equalisation (HEQ), with a mean relative error reduction of 52.7% over various noise types and levels. Embedding a Switching Linear Dynamical System (SLDS) into a Switching Autoregressive Hidden Markov Model (SAR-HMM) prevails for speech disturbed by additive white Gaussian noise.


Similar resources

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...


A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that exhibit a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced by the remaining components and statistical ...


An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for both feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...


Robust Speech Recognition for Adverse Environments

Although state-of-the-art speech recognizers can achieve a very high recognition rate for clean speech, recognition performance generally degrades drastically in noisy environments. Noise-robust speech recognition has become an important task for speech recognition in adverse environments. Recent research on noise-robust speech recognition has mostly focused on two directions: (1) removing the...


A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal is significantly reduced in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel speech enhancement of signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...



Journal:
  • EURASIP J. Audio, Speech and Music Processing

Volume 2009

Publication date: 2009